Video description: A comprehensive survey of deep learning approaches

نویسندگان

چکیده

Abstract Video description refers to understanding visual content and transforming that acquired into automatic textual narration. It bridges the key AI fields of computer vision natural language processing in conjunction with real-time practical applications. Deep learning-based approaches employed for video have demonstrated enhanced results compared conventional approaches. The current literature lacks a thorough interpretation recently developed sequence techniques description. This paper fills gap by focusing mainly on deep learning-enabled caption generation. Sequence models follow an Encoder–Decoder architecture employing specific composition CNN, RNN, or variants LSTM GRU as encoder decoder block. standard-architecture can be fused attention mechanism focus distinctiveness, achieving high quality results. Reinforcement learning within structure progressively deliver state-of-the-art captions following exploration exploitation strategies. transformer is modern efficient transductive robust output. Free from recurrence, solely based self-attention, it allows parallelization along training massive amount data. fully utilize available GPUs most NLP tasks. Recently, emergence several versions transformers, long term dependency handling not issue anymore researchers engaged summarization description, autonomous-vehicle, surveillance, instructional purposes. They get auspicious directions this research.

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches

Deep learning has demonstrated tremendous success in variety of application domains in the past few years. This new field of machine learning has been growing rapidly and applied in most of the application domains with some new modalities of applications, which helps to open new opportunity. There are different methods have been proposed on different category of learning approaches, which inclu...

متن کامل

A Comprehensive Survey of Data Processing Approaches

In Wireless Sensor Networks (WSNs), energy of the sensor nodes are the most concerning factor because each sensor is equipped with limited amount of energy which is used to perform lots of work. Sensors have capabilities of sensing, analyzing, processing and communication of the data. In WSNs, the most responsible factor behind energy consumption of the sensor nodes is Data Processing which mai...

متن کامل

A Comprehensive Survey of Video Encryption Algorithms

Encryption is the widely used technique to offer security for video communication and considerable numbers of video encryption algorithms have been proposed. The paper explores the literature for already proposed video encryption algorithms with the focus on the working principle of already proposed video encryption schemes. This study is aimed to give readers a quick overview about various vid...

متن کامل

a structural survey of the polish posters

تصویرسازی قابلیتهای فراوانی را دارا است

15 صفحه اول

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Artificial Intelligence Review

سال: 2023

ISSN: ['0269-2821', '1573-7462']

DOI: https://doi.org/10.1007/s10462-023-10414-6